Mining Issues in Traditional Indian Web Documents
Authors
Abstract
Similar Resources
Research Issues in Web Mining
The Web is a collection of interrelated files on one or more web servers, while web mining means extracting valuable information from web databases. Web mining is a data mining domain in which data mining techniques are used to extract information from web servers. Web data includes web pages, web links, objects on the web, and web logs. Web mining is used to understand the custom...
Mining Domain Specific Words from Web Documents
Web pages provide not only plain text materials for training language models but also tag information for semantics annotation. The tags could be found either explicitly in the HTML documents or implicitly through the directory hierarchy of the documents, since the directory hierarchy can be regarded as a kind of classification tree for web documents, which assigns an implicit hidden tag to eac...
Mining Web Documents for Unintended Information Revelation
This research concerns web site information security. With an increasing number of documents being generated by different individuals and departments in organizations, there is a potential for releasing information that is inconsistent with the overall goals, objectives, and operation of the organization. We refer to this as unintended information revelation (UIR). This paper focuses on progress...
Research Issues in Web Structural Delta Mining
Web structure mining has been a well-researched area during recent years. Based on the observation that data on the web may change at any time in any way, some incremental data mining algorithms have been proposed to update the mining results with the corresponding changes. However, none of the existing web structure mining techniques is able to extract useful and hidden knowledge from the sequ...
Web Mining: Clustering Web Documents A Preliminary Review
Evidently there is a tremendous proliferation in the amount of information found today on the largest shared information source, the World Wide Web (or simply the Web). The process of finding relevant information on the web can be overwhelming. Even with the presence of today’s search engines that index the web it is hard to wade through the large number of returned documents in a response to a...
Journal
Journal title: Indian Journal of Science and Technology
Year: 2015
ISSN: 0974-5645,0974-6846
DOI: 10.17485/ijst/2015/v8i1/77056